Ontology Development 101: A Guide to Creating Your First Ontology
The original: Natalya F. Noy and Deborah L. McGuinness, Stanford University, Stanford, CA, 94305, noy@smi.stanford.edu (not valid, 2017) and dlm@ksl.stanford.edu

An ontology defines a common vocabulary for researchers who need to share information in a domain.

Why would someone want to develop an ontology? Some of the reasons are:
* To share common understanding of the structure of information among people or software agents
* To enable reuse of domain knowledge
* To make domain assumptions explicit
* To separate domain knowledge from the operational knowledge
* To analyze domain knowledge: to structure it better, to see gaps in knowledge, to organize exploration of knowledge.

Applications may include the “Semantic Web” and personal development, an alternative to formal education.

An ontology is a formal explicit description of concepts in a domain of discourse (classes (sometimes called concepts)), properties of each concept describing various features and attributes of the concept (slots (sometimes called roles or properties)), and restrictions on slots (facets (sometimes called role restrictions)). An ontology together with a set of individual instances of classes constitutes a knowledge base. In reality, there is a fine line where the ontology ends and the knowledge base begins.

Classes are the focus of most ontologies. Classes describe concepts in the domain. For example, a class of wines represents all wines. Slots describe properties of classes and instances.

In practical terms, developing an ontology includes:
* defining classes in the ontology,
* arranging the classes in a taxonomic (subclass–superclass) hierarchy,
* defining slots (properties) and describing allowed values for these slots,
* filling in the values for slots for instances.

This is a “top-down” approach. But knowledge accumulation is usually from particular to general. Ontology development is necessarily an iterative process.
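The vocabulary above (classes, slots, facets, instances) can be made concrete with a small sketch. This is only one possible illustration, not how any particular ontology editor stores its data. The names `Slot`, `OntologyClass`, `Instance`, `ExampleRedWine` and `ExampleWinery` are hypothetical and chosen for this example; the wine and maker vocabulary follows the examples used in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class, with facets restricting its values."""
    name: str
    value_type: type          # facet: allowed value type
    min_cardinality: int = 0  # facet: fewest values allowed
    max_cardinality: int = 1  # facet: most values allowed


@dataclass
class OntologyClass:
    """A concept in the domain, placed in a subclass-superclass hierarchy."""
    name: str
    superclass: "OntologyClass | None" = None
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot):
        self.slots[slot.name] = slot

    def all_slots(self):
        """Slots defined here plus those inherited from superclasses."""
        inherited = self.superclass.all_slots() if self.superclass else {}
        return {**inherited, **self.slots}


@dataclass
class Instance:
    """An individual member of a class; instances form the knowledge base."""
    name: str
    of_class: OntologyClass
    values: dict = field(default_factory=dict)

    def set_value(self, slot_name, *vals):
        """Fill in a slot, checking the cardinality and type facets."""
        slot = self.of_class.all_slots()[slot_name]
        if not slot.min_cardinality <= len(vals) <= slot.max_cardinality:
            raise ValueError(f"{slot_name}: wrong number of values")
        if not all(isinstance(v, slot.value_type) for v in vals):
            raise TypeError(f"{slot_name}: wrong value type")
        self.values[slot_name] = list(vals)


# Hypothetical example data: a class, a subclass, one slot, one instance.
wine = OntologyClass("Wine")
wine.add_slot(Slot("maker", str, min_cardinality=1))
red_wine = OntologyClass("RedWine", superclass=wine)
bottle = Instance("ExampleRedWine", red_wine)
bottle.set_value("maker", "ExampleWinery")
```

Here the ontology is the pair of `OntologyClass` objects with their slots, and the knowledge base is the ontology plus `bottle`. The subclass inherits the `maker` slot, so the instance can fill it in without the slot being redefined.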
Concepts in the ontology should be close to objects (physical or logical) and relationships in your domain of interest. These are most likely to be nouns (objects) or verbs (relationships) in sentences that describe your domain.

Step 1. Determine the domain and scope of the ontology
* What is the domain that the ontology will cover?
* What are we going to use the ontology for?
* For what types of questions should the information in the ontology provide answers?
* Who will use and maintain the ontology?

One of the ways to determine the scope of the ontology is to sketch a list of questions that a knowledge base based on the ontology should be able to answer: competency questions (Gruninger and Fox 1995). These questions will serve as the litmus test later: Does the ontology contain enough information to answer these types of questions? Do the answers require a particular level of detail or representation of a particular area?

Step 2. Consider reusing existing ontologies

It is almost always worth considering what someone else has done and checking if we can refine and extend existing sources for our particular domain and task. Reusing existing ontologies may be a requirement if our system needs to interact with other applications that have already committed to particular ontologies or controlled vocabularies.

There are libraries of reusable ontologies on the Web and in the literature. For example, we can use the Ontolingua ontology library (http://www.ksl.stanford.edu/software/ontolingua/) or the DAML ontology library (http://www.daml.org/ontologies/). There are also a number of publicly available commercial ontologies (e.g., UNSPSC (www.unspsc.org), RosettaNet (www.rosettanet.org), DMOZ (www.dmoz.org)).

Step 3. Enumerate important terms in the ontology

It is useful to write down a list of all terms we would like either to make statements about or to explain to a user. What are the terms we would like to talk about? What properties do those terms have?
What would we like to say about those terms?

Step 4. Define the classes and the class hierarchy

A top-down development process starts with the definition of the most general concepts in the domain and subsequent specialization of the concepts.

A bottom-up development process starts with the definition of the most specific classes, the leaves of the hierarchy, with subsequent grouping of these classes into more general concepts.

A combination development process is a combination of the top-down and bottom-up approaches: We define the more salient concepts first and then generalize and specialize them appropriately.

Step 5. Define the properties of classes: slots

That is, further define each class. In general, there are several types of object properties that can become slots in an ontology:
* “intrinsic” properties such as the flavor of a wine;
* “extrinsic” properties such as a wine’s name, and area it comes from;
* parts, if the object is structured; these can be both physical and abstract “parts” (e.g., the courses of a meal)
* relationships to other individuals; these are the relationships between individual members of the class and other items (e.g., the maker of a wine, representing a relationship between a wine and a winery)

Step 6. Define the facets of the slot

Slot cardinality

Slot cardinality defines how many values a slot can have.

Step 7. Create instances

The last step is creating individual instances of classes in the hierarchy.
Defining an individual instance of a class requires (1) choosing a class, (2) creating an individual instance of that class, and (3) filling in the slot values.

4.1 Ensuring that the class hierarchy is correct

An “is-a” relation

The class hierarchy represents an “is-a” relation: a class A is a subclass of B if every instance of A is also an instance of B.

Transitivity of the hierarchical relations

A subclass relationship is transitive: If B is a subclass of A and C is a subclass of B, then C is a subclass of A.

Evolution of a class hierarchy

Maintaining a consistent class hierarchy may become challenging as domains evolve.

Avoiding class cycles

We should avoid cycles in the class hierarchy. We say that there is a cycle in a hierarchy when some class A has a subclass B and at the same time B is a superclass of A. In other words, the relationship between A and B should be clear. Define each concept clearly.

Creating such a cycle in a hierarchy amounts to declaring that the classes A and B are equivalent: all instances of A are instances of B and all instances of B are also instances of A. Indeed, since B is a subclass of A, all B’s instances must be instances of the class A. Since A is a subclass of B, all A’s instances must also be instances of the class B.

4.2 Analyzing siblings in a class hierarchy

Siblings in a class hierarchy

Siblings in the hierarchy are classes that are direct subclasses of the same class.

A rule of thumb: If a class has only one direct subclass there may be a modeling problem or the ontology is not complete. If there are more than a dozen subclasses for a given class then additional intermediate categories may be necessary.

4.3 Multiple inheritance

Most knowledge-representation systems allow multiple inheritance in the class hierarchy: a class can be a subclass of several classes.
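The hierarchy checks described in 4.1 to 4.3 can be sketched in a few lines. The following is a minimal illustration, assuming the hierarchy is stored as a plain dictionary that maps each class name to the list of its direct superclasses, so multiple inheritance is simply a list with more than one entry. The function names and the sample class names are hypothetical and not taken from any ontology tool.

```python
def ancestors(hierarchy, cls):
    """All direct and indirect superclasses of cls (transitive closure)."""
    seen = set()
    stack = list(hierarchy.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen


def classes_in_cycles(hierarchy):
    """Classes that are their own ancestor, i.e. that lie on a cycle."""
    return {cls for cls in hierarchy if cls in ancestors(hierarchy, cls)}


def sibling_warnings(hierarchy):
    """Classes with exactly one, or more than a dozen, direct subclasses."""
    children = {}
    for cls, parents in hierarchy.items():
        for parent in parents:
            children.setdefault(parent, []).append(cls)
    return {p: len(c) for p, c in children.items()
            if len(c) == 1 or len(c) > 12}


# Hypothetical sample hierarchy: class name -> direct superclasses.
hierarchy = {
    "Wine": [],
    "RedWine": ["Wine"],
    "DessertWine": ["Wine"],
    "Port": ["RedWine", "DessertWine"],  # multiple inheritance
}

port_ancestors = ancestors(hierarchy, "Port")  # includes "Wine" by transitivity
no_cycles = classes_in_cycles(hierarchy)       # empty set: hierarchy is acyclic
warnings = sibling_warnings(hierarchy)         # classes with a single subclass

# Introduce a modeling error: Wine declared as a subclass of Port.
hierarchy["Wine"] = ["Port"]
cyclic = sorted(classes_in_cycles(hierarchy))
print(cyclic)  # ['DessertWine', 'Port', 'RedWine', 'Wine']
```

After the erroneous edit every class on the loop is reported, which matches the observation above that a cycle effectively declares the classes involved to be equivalent.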
It is hard to navigate both an extremely nested hierarchy with many extraneous classes and a very flat hierarchy that has too few classes with too much information encoded in slots. Finding the appropriate balance, though, is not easy.

4.5 A new class or a property value?

When modeling a domain, we often need to decide whether to model a specific distinction (such as white, red, or rosé wine) as a property value or as a set of classes. The choice again depends on the scope of the domain and the task at hand.

Rule: If the concepts with different slot values become restrictions for different slots in other classes, then we should create a new class for the distinction. Otherwise, we represent the distinction in a slot value.

Rules: If a distinction is important in the domain and we think of the objects with different values for the distinction as different kinds of objects, then we should create a new class for the distinction. A class to which an individual instance belongs should not change often.

4.6 An instance or a class?

Deciding whether a particular concept is a class in an ontology or an individual instance depends on what the potential applications of the ontology are. Deciding where classes end and individual instances begin starts with deciding what is the lowest level of granularity in the representation. The level of granularity is in turn determined by a potential application of the ontology. In other words, what are the most specific items that are going to be represented in the knowledge base? Individual instances are the most specific concepts represented in a knowledge base.

4.7 Limiting the scope

The ontology should not contain all the possible information about the domain: you do not need to specialize (or generalize) more than you need for your application (at most one extra level each way).

5 Defining properties

5.1 Inverse slots

These two relations, maker and produces, are called inverse relations.
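As a rough illustration of inverse relations, the sketch below records a value for one slot and fills in the matching value for its inverse at the same time. It assumes a knowledge base held in a nested dictionary; `INVERSE`, `set_relation`, `ExampleWine` and `ExampleWinery` are hypothetical names used only for this example.

```python
# Each slot name maps to the name of its inverse slot.
INVERSE = {"maker": "produces", "produces": "maker"}


def set_relation(kb, subject, slot, obj):
    """Record slot(subject) = obj and fill in the inverse relation as well."""
    kb.setdefault(subject, {}).setdefault(slot, set()).add(obj)
    inverse = INVERSE.get(slot)
    if inverse is not None:
        kb.setdefault(obj, {}).setdefault(inverse, set()).add(subject)


kb = {}
set_relation(kb, "ExampleWine", "maker", "ExampleWinery")
print(kb["ExampleWinery"]["produces"])  # {'ExampleWine'}
```

Whichever side the user fills in, both directions end up in the knowledge base, so the two slots cannot drift out of step.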
Storing the information “in both directions” is redundant. When we know that a wine is produced by a winery, an application using the knowledge base can always infer the value for the inverse relation that the winery produces the wine. However, from the knowledge-acquisition perspective it is convenient to have both pieces of information explicitly available. This approach allows users to fill in the wine in one case and the winery in another. The knowledge-acquisition system could then automatically fill in the value for the inverse relation, ensuring consistency of the knowledge base.

5.2 Default values

If a particular slot value is the same for most instances of a class, we can define this value to be a default value for the slot. Then, when each new instance of a class containing this slot is created, the system fills in the default value automatically.

6 What’s in a name?

Defining naming conventions for concepts in an ontology and then strictly adhering to these conventions not only makes the ontology easier to understand but also helps avoid some common modeling mistakes. There are many alternatives in naming concepts. Often there is no particular reason to choose one or another alternative. However, we need to define a naming convention for classes and slots and adhere to it.

6.1 Capitalization and delimiters

First, we can greatly improve the readability of an ontology if we use consistent capitalization for concept names. For example, it is common to capitalize class names and use lower case for slot names (assuming the system is case-sensitive).

6.2 Singular or plural

A class name represents a collection of objects. Use either singular or plural to name a class, but do it consistently.

Conclusion: “The proof is in the pudding.” We can assess the quality of our ontology only by using it in applications for which we designed it.

Category:Ontology